Abstract.

In this paper, we study two sets of local geomagnetic indices from 26 stations
using the principal component (PC) and the independent component
(IC) analysis methods. We demonstrate that the annually averaged indices
can be accurately represented as linear combinations of the two first components
with weights systematically depending on latitude. We show that the annual
contributions of coronal mass ejections (CMEs) and high speed streams (HSSs)
to geomagnetic activity are highly correlated with the first and second IC.

The first and second ICs are also found to be very highly correlated with the
strength of the interplanetary magnetic field (IMF) and the solar wind speed,
respectively, because solar wind speed is the most important parameter driving
geomagnetic activity during HSSs while IMF strength dominates during
CMEs. These results help in better understanding the long-term driving
of geomagnetic activity and in gaining information about the long-term
evolution of solar wind parameters and the different solar wind structures.

1. Introduction

Geomagnetic activity is produced in the interaction between the solar wind and the 
Earth’s magnetic field. It has been studied systematically since the late 19th century using
different geomagnetic indices. The most common geomagnetic indices are global indices such
as aa, Kp/Ap, Dst and AE, which are constructed from local indices, e.g., as weighted 
or normalized averages. For example, the Kp index is calculated from local K indices 
of 13 magnetic observatories located at midlatitudes and subauroral latitudes. Local 
geomagnetic indices are mainly used to derive global indices but the differences between 
local indices are rarely studied. This is surprising since there are over 200 magnetic 
observatories around the world continuously producing magnetic measurements, but the 
state of the Earth’s magnetic field is often described by just one globally averaged number. 

It has been known for a long time that global geomagnetic activity (measured, e.g., 
by the aa index) exhibits a dual peak structure during the solar cycle [Chapman and 
Bartels, 1940; Newton, 1948], the first peak during the solar maximum dominated by 
transient activity and the second peak during the declining phase related to recurrent 
activity. Later it became clear that the first peak is mainly produced by coronal mass 
ejections (CMEs) and the second peak mainly by high speed streams (HSSs) [Simon and 
Legrand, 1986; Gosling et al., 1991]. It is now known that there are significant differences
between CME and HSS-related geomagnetic activities. For example, CMEs are responsible
for the largest geomagnetic storms [Borovsky and Denton, 2006] while HSSs dominate
substorm activity [Tanskanen et al., 2005]. Because of these differences, one can expect
that average geomagnetic activity over suitably long time intervals can be decomposed
into two components, one related to CME activity and the other related to HSS activity.
Richardson et al. [2000, 2002] have identified times when CMEs and HSSs were present
in the solar wind at 1 AU and studied the contributions of CMEs and HSSs to the aa
index. They found that during solar maximum most aa activity is related to CMEs while
during the declining phase and solar minimum most aa activity is related to HSSs. Feynman
[1982] decomposed the annual aa index into two components, the ‘R’ component being
linearly related to the sunspot number and the residual ‘I’ component defined as I = aa
- R. While the R component is mainly produced by CMEs, the I component is more
closely related to HSSs. This decomposition is reasonable, but it assumes, e.g., that the
CME contribution to geomagnetic activity strictly follows the sunspot number, an assumption
that holds poorly around solar maxima [Richardson and Cane, 2012].

Recently we used the principal component analysis (PCA) method to extract information
on the solar wind drivers of annually averaged geomagnetic activity using a set of
local Ah indices [Holappa et al., 2014]. We found that the first principal component (PC1)
represents the global average of the Ah indices and correlates almost perfectly with the Ap
index and that the second principal component (PC2) highly correlates with the annual
fraction of high speed streams in the solar wind. The PCA method, however, does not
decompose geomagnetic activity into pure CME and HSS components. For example, the
first PC representing global geomagnetic activity is a mixture of CME and HSS effects,
which both contribute significantly to global geomagnetic activity [Richardson and Cane,
2012].

In this paper we develop the method further and show that the spatio-temporal information
included in local indices of geomagnetic activity can be used to extract information
about the independent contributions of HSSs and CMEs to geomagnetic activity without
any external information about, e.g., solar activity or solar cycle phase. We also
use this information to study the contributions of the two main solar wind parameters,
the solar wind speed and the interplanetary magnetic field (IMF) intensity, to geomagnetic
activity. This paper is organized as follows. Section 2 introduces the Ah and IHV
(Inter-hourly Variability) indices used in this study. In Section 3 the principal component
analysis (PCA) method that we used earlier [Holappa et al., 2014] is briefly reviewed and
applied to the Ah and IHV indices. The principal components are then processed using the
independent component analysis (ICA) method in Section 4. The relation of the two first
independent components (ICs) to solar wind speed and IMF intensity, as well as to CME
and HSS fractions, is discussed in Section 5. Finally, conclusions are given in Section 6.

2. Local geomagnetic indices and other data 

We use two different measures of local geomagnetic activity: the Ah index [Mursula and
Martini, 2007] and the IHV index [Svalgaard and Cliver, 2007]. The three-hourly Ah index
is analogous to Ak, the linearized K index [Bartels et al., 1939], calculated from hourly
data as the range of variation of the local horizontal magnetic field after removing the
quiet day (Sq) variation. However, the quiet day variation cannot be fully removed from
the data by any method and some amount of residual quiet day variation also remains in
the Ah indices. In order to exclude the possibility that the residual Sq variation affects
our results based on the Ah indices we also use IHV indices which are calculated using
only local night sector data and are thus practically unaffected by Sq variation. The
daily IHV index is defined as the average of six absolute hourly differences of the local
horizontal magnetic field around local midnight [Svalgaard and Cliver, 2007].
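
The IHV definition above can be illustrated with a minimal sketch. The function name `daily_ihv`, the use of seven consecutive hourly values to obtain six differences, and the sample numbers are assumptions made for illustration only; the exact hours entering the index are specified in the cited reference, not here.

```python
def daily_ihv(hourly_h):
    """Average of the six absolute differences between seven consecutive hourly H values (nT)."""
    if len(hourly_h) != 7:
        raise ValueError("expected seven hourly values spanning local midnight")
    diffs = [abs(b - a) for a, b in zip(hourly_h, hourly_h[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical hourly means of the horizontal component around local midnight (nT)
night = [20010.0, 20013.0, 20009.0, 20009.0, 20015.0, 20012.0, 20011.0]
print(daily_ihv(night))
```

Because only differences between neighbouring hours enter, a constant baseline offset in the data does not change the result.
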

We use the Ah and the IHV indices of the 26 observatories listed in Table 1. The
selection criteria for the stations were high quality and long-term continuity of their data
sets and good global coverage. We only selected stations which have less than 20% of
data missing for any year. We calculated the Ah and IHV indices for 1966-2011 (46
years) using hourly mean data obtained from the World Data Center of Edinburgh [WDC-C1,
2011]. Before calculating the indices, we checked the baselines and excluded the
outliers from the magnetic data by using a three-point median filter (for more details,
see Holappa et al., 2014). We also rescaled the Ah and IHV indices of the CLF station
for years 1966-1971 because CLF recorded spot values instead of hourly means until the
end of 1971, leading to excessively large Ah and IHV values in these years. For this, we
calculated the averages of the ratios Ah(CLF)/Ah(NGK) and IHV(CLF)/IHV(NGK) in
1972-1981 and in 1962-1971 and multiplied Ah(CLF) and IHV(CLF) before 1971 by the
corresponding ratios (0.8146 and 0.7736, respectively) so that the Ah(CLF)/Ah(NGK)
and IHV(CLF)/IHV(NGK) ratios became continuous. (Note that NGK and CLF are
geographically close to each other, which allows a meaningful comparison between the two
stations.)

In addition to the magnetic data of ground stations, we use solar wind data from the 
OMNI database (http://omniweb.gsfc.nasa.gov/) and the classification of solar wind
flow types by Richardson and Cane [2012]. There are three different solar wind types
identified by Richardson and Cane [2012]: CMEs (including the cores of interplanetary
CMEs and their related shocks and sheath regions), HSSs (corotating streams from coronal
holes) and slow solar wind.

3. Principal component analysis method

Principal component analysis [Jolliffe, 2005] is a statistical method which can be used
to represent a large number of correlated variables as linear combinations of a few uncorrelated
variables called principal components. Here we apply PCA to annual means (46
years) of geomagnetic indices from 26 observatories. Before evaluating PCA we calculate
the standardized annual means for each station separately

$$Ah_s = \frac{Ah - \langle Ah \rangle}{\sigma}, \qquad (1)$$

where ⟨Ah⟩ is the mean and σ the standard deviation of the annually averaged Ah. We
calculate the standardized annual mean IHV_s indices in the same way. Standardized
annual means are then collected into the columns of the data matrix X (size 46 x 26).
PCA can be evaluated using the singular value decomposition of the data matrix (see,
e.g., Hannachi et al., 2007)

$$X = U D V^T, \qquad (2)$$

where U and V are orthogonal matrices (UU^T = I and VV^T = I) and D =
diag(λ_1, λ_2, ..., λ_26) contains the so called singular values of the matrix X. The column
vectors of the 26 x 26 matrix V are called here the empirical orthogonal functions (EOFs).
The principal components are obtained as the column vectors of the 46 x 26 matrix

$$P = U D. \qquad (3)$$

The original variables can then be approximated as a linear combination of the K first
principal components with weights given by the EOFs as

$$X_{ij} = \sum_{k=1}^{K} P_{ik} V_{jk}, \qquad (4)$$

where X_ij is the value (standardized Ah index) of the jth variable (station) at the observation
time (year) i. The variance of the kth PC is proportional to λ_k^2. Hence, the K first
PCs include the following percentage

$$\frac{\sum_{k=1}^{K} \lambda_k^2}{\sum_{k=1}^{26} \lambda_k^2} \cdot 100\,\% \qquad (5)$$

of the variance in the original variables.
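
The chain from Eq. 1 to Eq. 5 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are synthetic stand-ins rather than real Ah or IHV values, the function names are hypothetical, the population standard deviation is assumed in Eq. 1, and the thin form of the SVD is used so that P has the 46 x 26 size quoted in the text.

```python
import numpy as np

def standardize(annual_means):
    """Eq. 1: remove each station's mean and divide by its (population) standard deviation."""
    return (annual_means - annual_means.mean(axis=0)) / annual_means.std(axis=0)

def pca_svd(X):
    """Eqs. 2-3: thin SVD of X; returns PCs (P = U D), EOFs (columns of V), singular values."""
    U, lam, Vt = np.linalg.svd(X, full_matrices=False)
    return U * lam, Vt.T, lam

def explained_percent(lam, K):
    """Eq. 5: percentage of variance carried by the K first PCs."""
    return 100.0 * np.sum(lam[:K] ** 2) / np.sum(lam ** 2)

# Synthetic stand-in for the 46 x 26 matrix of annual means (not real index data)
rng = np.random.default_rng(0)
common = rng.normal(size=(46, 1))                      # shared "global" variation
annual = 10.0 + 3.0 * common + rng.normal(scale=0.5, size=(46, 26))

X = standardize(annual)
P, V, lam = pca_svd(X)
X2 = P[:, :2] @ V[:, :2].T                             # Eq. 4 with K = 2
print(explained_percent(lam, 1), explained_percent(lam, 2))
```

With all 26 components retained, Eq. 4 reproduces X exactly; truncating to K = 2 gives the rank-two approximation used in the rest of the analysis.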


3.1. The first PC 

Figure 1 shows the first principal components of the Ah_s and IHV_s indices (to be called
PC1(Ah) and PC1(IHV)). One can see that there is an excellent agreement between
the PC1s of the two indices. The respective EOF1(Ah) and EOF1(IHV) depicted in
Figure 2 describe the latitudinal modes associated with the PC1s. As we found earlier
[Holappa et al., 2014], EOF1(Ah) is almost flat (independent of latitude), meaning that
all stations contribute with roughly equal weights to PC1. Hence, PC1(Ah) is very
closely proportional to the average of the 26 Ah_s indices. EOF1(IHV) is also almost
flat except for a small local minimum at the poleward boundary of the auroral oval
(stations #24 and #25).

PC1(Ah) and PC1(IHV) correlate almost perfectly with the annual averages
of the Ap index of global geomagnetic activity (Pearson correlation coefficients and
p-values for zero correlation from Student’s t-test: cc(Ah) = 0.99, p = 6.4 · 10^(-?); cc(IHV)
= 0.98, p = 2.2 · 10^(-?)), which is also shown in Fig. 1. Thus, PC1(Ah) and PC1(IHV)
also closely represent the mean global geomagnetic activity. PC1(Ah) and PC1(IHV)
already explain a large fraction of the variance of the Ah_s (95.6%) and the IHV_s indices
(90.1%). Thus, at the annual timescale all stations at different latitudes observe roughly
the same (mainly solar cycle related) long-term variation of geomagnetic activity.

3.2. The second PC 

PC2(Ah) and PC2(IHV) are shown in Figure 3 and the associated EOF2(Ah) and
EOF2(IHV) in Figure 2. As described above, the first PCs practically represent the annual
global averages of the two indices. Therefore, the second PCs describe how these local
indices at the individual stations deviate on average from their global averages. For
years of positive PC2(Ah) (PC2(IHV), respectively), the Ah_s (IHV_s) indices of stations
with positive (negative) EOF2 coefficients are higher (lower) than the globally averaged
Ah_s (IHV_s), and vice versa for years of negative PC2 values. This is demonstrated in
Figure 4a which shows the difference between the Ah_s index of the FCC station and the
average of all 26 Ah_s indices. For any year, Ah(FCC) is expected to depart from the mean
of all Ah_s indices by PC2(Ah) times the EOF2 coefficient for FCC (EOF2(FCC) = 0.41).
The second PC scaled by 0.41 (also shown in Fig. 4a) indeed explains the annual differences
between the mean Ah_s and Ah(FCC) very well. Figure 4b shows an analogous difference
for IHV(FCC). One can see that PC2(IHV) scaled by 0.50 (EOF2(IHV) = 0.50 for FCC)
explains the annual differences between IHV(FCC) and the global IHV_s very well.

Note that PC2 only explains 1.8% (4.9%) of the total variance of the Ah_s indices
(IHV_s indices). Therefore, the annual deviations of individual station indices from the
global average are not very large, especially for stations whose EOF2 coefficients are close
to zero (see Fig. 2). However, the auroral stations at 65°-75° CGM latitudes (like FCC)
with the greatest positive EOF2 coefficients can notably differ from the global average.
For example, the absolute difference between IHV_s(FCC) and the global mean of the IHV_s
indices (Fig. 4b) can be more than one (standard deviation), which is a large difference
for annual means.

As noted earlier [Holappa et al., 2014], PC2(Ah) is very highly correlated (cc = 0.82; p
= 4.6 · 10^(-?)) with the annual time fraction of high-speed streams in the solar wind. This can
also be seen in Fig. 3, which shows the annual fraction of HSSs in the solar wind according to
the classification of solar wind into three flow types [Richardson and Cane, 2012]. Figure
3 also shows the corresponding annual fractions of CMEs, which are highly anticorrelated
with the HSS fractions. Consequently, PC2(Ah) is anticorrelated with the CME fraction
(cc = -0.67; p = 3.5 · 10^(-?)). PC2(IHV) is also very highly correlated with the HSS fraction
(cc = 0.79, p = 8.2 · 10^(-?)) and anticorrelated with the CME fraction (cc = -0.83; p =
9.7 · 10^(-13)).

Figure 5 shows the averages of the Ah_s and IHV_s indices during CMEs and HSSs.
Averages of the standardized three-hourly values of the Ah indices were calculated over those
three-hour periods when only one solar wind type (CME or HSS) was present in the
solar wind. Similarly, averages of the standardized daily values of the IHV indices were
calculated over those local nights when only one solar wind type was present. As seen
in Fig. 5, there are clear latitudinal patterns in the Ah_s and IHV_s indices during CMEs
and HSSs. One can note the high similarity between EOF2(Ah) (see Fig. 2a) and the
distribution of the Ah_s indices during HSSs (Fig. 5a). The distribution of the Ah_s indices
during CMEs is almost the mirror image of the HSS distribution. The IHV_s indices
during CMEs and HSSs (Fig. 5b) show roughly the same patterns as the corresponding
Ah_s indices. EOF2(IHV) (see Fig. 2b) also resembles EOF2(Ah) (see Fig. 2a)
and matches the distribution of the IHV_s indices during HSSs (see Fig. 5b).

Because the second PCs of the Ah_s and IHV_s indices correlate (anticorrelate) with the
HSS (CME) fraction and the second EOFs match the latitudinal distributions of
the indices during HSSs (CMEs), one can conclude that PC2 is (mainly) caused by the
latitudinally different response of local geomagnetic activity to CMEs and HSSs. Figure
5 shows that during HSSs the strongest values of the Ah_s and IHV_s indices are found at
the auroral latitudes (65°-75°) while during CMEs the Ah_s and IHV_s indices have a
(local) maximum at subauroral latitudes (55°-63°). We showed earlier [Holappa et al.,
2014] that the relative contribution of HSS driven substorms maximizes at the auroral latitudes
while the relative effect of CME driven substorms maximizes at subauroral latitudes
(where substorms are observed especially during magnetic storms [Tanskanen et al., 2002;
Hoffman et al., 2010]), which explains the subauroral minimum and the auroral maximum
of EOF2. Since the IHV_s indices only measure geomagnetic activity in the night sector, i.e.,
in the preferred local time (LT) sector of substorms, they are more sensitive to substorms
(and therefore to HSSs) than the Ah_s indices. This explains the slightly larger variation of
IHV_s during HSSs between the auroral maximum and the subauroral minimum (see Fig. 5b).
This also explains why EOF2(IHV) shows a higher auroral maximum than EOF2(Ah)
(see Fig. 2 and discussion later).

4. Independent component analysis method

The basic idea of the independent component analysis (ICA) is analogous to that of the
principal component analysis: initially dependent variables are presented as a linear combination
of statistically independent components. There are numerous ways to perform
ICA (see, e.g., Hyvärinen et al., 2001), but we use here the FastICA software package
(Hyvärinen [1999], http://research.ics.aalto.fi/ica/fastica/).

While the principal components obtained by the PCA method are uncorrelated, they
are not necessarily statistically independent. Actually, only if the principal components
are Gaussian does their uncorrelatedness also guarantee their statistical independence. To see
if the two first principal components are independent or not, we first standardize them
to unit variance by dividing them by their standard deviations σ_1 and σ_2. Using matrix
notation the standardized PCs are the columns of the matrix

$$P_s = P_2 Z, \qquad (6)$$

where the 46 x 2 matrix P_2 contains the two first columns of the matrix P of Eq. 3
and Z = diag(σ_1^{-1}, σ_2^{-1}). Figure 6 shows a scatter plot of the standardized PC1(Ah) and
PC2(Ah). If the two PCs were statistically independent, the scatter pattern would be
spherically symmetric. Clearly this is not the case. One can see, e.g., that a positive
value of PC1 implies either a large positive or a large negative value of PC2, and a
negative value of PC1 implies a small value of PC2. The idea of the IC analysis is to find
an orthogonal rotation of the principal components that makes the rotated components
statistically as independent as possible. The rotation of the principal components can
be written as

$$S = A P_s^T, \qquad (7)$$

where the orthogonal 2 x 2 matrix A is the so called mixing matrix and the rows of the
2 x 46 matrix S contain the independent components (with unit variances). The ICA
algorithm finds the matrix A in an iterative process by minimizing the entropies of the
independent components. The independent components are maximally non-Gaussian,
because the Gaussian distribution has the greatest entropy among all distributions with
the same variance.
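
The idea of searching for the most non-Gaussian rotation can be illustrated in two dimensions without the FastICA package. The sketch below is not the FastICA algorithm: as a stated simplification it replaces the entropy criterion with the summed squared excess kurtosis and replaces the iteration with a plain grid search over the rotation angle. All function names and the synthetic sources are assumptions for illustration.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis (zero for a Gaussian variable)."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

def rotation(theta):
    """Orthogonal 2 x 2 matrix for an angle theta given in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def ica_by_angle_search(Ps, n_angles=900):
    """Return (theta, A, S) maximizing the summed squared excess kurtosis of the rows of S."""
    best_theta, best_score = 0.0, -np.inf
    # Angles in [0, 90) degrees suffice: adding 90 degrees only swaps and sign-flips the rows.
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        S = rotation(theta) @ Ps.T
        score = sum(excess_kurtosis(row) ** 2 for row in S)
        if score > best_score:
            best_theta, best_score = theta, score
    A = rotation(best_theta)
    return best_theta, A, A @ Ps.T

# Synthetic check: two independent uniform sources mixed by a known 37.4 degree rotation
rng = np.random.default_rng(3)
sources = rng.uniform(-1.0, 1.0, size=(2, 5000))
sources = (sources - sources.mean(axis=1, keepdims=True)) / sources.std(axis=1, keepdims=True)
Ps = (rotation(np.radians(37.4)).T @ sources).T       # plays the role of the matrix P_s
theta_hat, A, S = ica_by_angle_search(Ps)
print(np.degrees(theta_hat))
```

With only 46 annual values, as in the real data, any such estimate is much noisier than in this synthetic example with 5000 samples.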

The principal components are projected onto the basis defined by the row vectors of
the matrix A, which are shown in Figure 6 as IC1 and IC2. The matrix A calculated for the
Ah_s indices performs a clockwise rotation by 37.4°, whence IC1(Ah) = 0.79 · PC1_s(Ah) -
0.61 · PC2_s(Ah) and IC2(Ah) = 0.61 · PC1_s(Ah) + 0.79 · PC2_s(Ah). For the IHV_s indices
the rotation angle is 52.3°, whence IC1(IHV) = 0.61 · PC1_s(IHV) - 0.79 · PC2_s(IHV) and
IC2(IHV) = 0.79 · PC1_s(IHV) + 0.61 · PC2_s(IHV).
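
As a worked check, the coefficients quoted above follow from the two rotation angles alone. The explicit matrix form below is written out here for illustration; only the angles and the rounded coefficients come from the text.

```latex
A(\theta) =
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
A(37.4^\circ) \approx
\begin{pmatrix} 0.79 & -0.61 \\ 0.61 & 0.79 \end{pmatrix},
\qquad
A(52.3^\circ) \approx
\begin{pmatrix} 0.61 & -0.79 \\ 0.79 & 0.61 \end{pmatrix}
```

Here cos 37.4° ≈ 0.794 and sin 37.4° ≈ 0.607, while cos 52.3° ≈ 0.612 and sin 52.3° ≈ 0.791, so the first row gives the weights of IC1 and the second row those of IC2.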

Using Equations 6 and 7 the original data matrix can be approximated as

$$X = P_2 V^T = P_s Z^{-1} V^T = S^T A Z^{-1} V^T, \qquad (8)$$

where V is restricted to its two first columns and the row vectors of the matrix A Z^{-1} V^T
can be interpreted as the spatial modes (SM) corresponding to the two independent
components (in an analogous way to the matrix V^T
in Eq. 2). These spatial modes are obtained by rotation from the EOFs in V, but they are
not orthogonal because the matrix A Z^{-1} is not orthogonal due to the different variances
of the principal components. Equation 8 is analogous to Eq. 4 and simply states that the
original data can be represented as the following linear combination

$$X_{ij} = IC1(i) \cdot SM1(j) + IC2(i) \cdot SM2(j), \qquad (9)$$

where IC1(i) and IC2(i) are the two independent components for year i and SM1(j) and
SM2(j) are the corresponding spatial mode coefficients for station j.
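
The bookkeeping of Eqs. 6 to 9 can be checked numerically. The sketch below uses random synthetic data in place of the real indices and simply fixes the rotation angle to the 37.4° quoted for Ah instead of estimating it; the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(46, 26))
X = (X - X.mean(axis=0)) / X.std(axis=0)       # synthetic standardized data, not real indices

U, lam, Vt = np.linalg.svd(X, full_matrices=False)
P2 = (U * lam)[:, :2]                          # two first PCs (46 x 2)
V2 = Vt[:2].T                                  # two first EOFs (26 x 2)

sigma = P2.std(axis=0)                         # standard deviations of PC1 and PC2
Z = np.diag(1.0 / sigma)
Ps = P2 @ Z                                    # Eq. 6: standardized PCs

theta = np.radians(37.4)                       # angle quoted in the text for Ah
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = A @ Ps.T                                   # Eq. 7: rows are IC1 and IC2
SM = A @ np.linalg.inv(Z) @ V2.T               # Eq. 8: rows are SM1 and SM2

X_rank2 = S.T @ SM                             # Eq. 9 for all years and stations
print(np.allclose(X_rank2, P2 @ V2.T))
```

The rotation leaves the rank-two approximation of X unchanged; it only redistributes it between the two components, and the rows of SM are in general no longer orthogonal.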

ICA could also be directly applied to the original data matrix, but the ensuing ICs are 
not ordered according to decreasing (or increasing) importance (fraction of total variance) 
and thereby do not reflect the physically most important processes. Rather, in this case, 
ICA tends to emphasize spikes in the data, which are highly non-Gaussian, but misses the 
physically relevant patterns. Instead, first reducing the dimension of the data by including
only the two leading PCs in the ICA ensures that the two ICs also include a large fraction
(95%) of the variance and the important physics.

4.1. The first and second IC 

Figures 7 and 8 show the first and second independent components for the two indices, 
respectively. One can see that the ICs of the two indices are very similar to each other,
as expected from the similarity of the two first PCs of these indices. The correlations
between the ICs of the two indices are very high: cc(IC1(Ah), IC1(IHV)) = 0.95, p =
4.9 · 10^(-23) and cc(IC2(Ah), IC2(IHV)) = 0.94, p = 1.1 · 10^(-?).

The spatial modes corresponding to the two ICs are depicted in Figure 9. One can
see that the two spatial modes are almost mirror images of each other for both indices,
especially for Ah_s. However, the first spatial mode of IHV shows a very deep minimum at
auroral latitudes, which is also related to the dip in EOF1(IHV) (Fig. 2). Note also that
SM2(IHV) is generally larger than SM1(IHV). This means that the second IC has, on
average, a higher weight in the IHV indices than the first IC. This is opposite to the Ah
indices, for which the first IC dominates.

5. Relation to solar wind and IMF 

Annual averages of the IMF intensity B and the solar wind speed v are plotted in Figures 
7c and 8c, respectively. One can see that IC1(Ah) and IC1(IHV) are very highly correlated
with the IMF intensity B, with cc(IC1(Ah), B) = 0.90; p = 4.2 · 10^(-?) and cc(IC1(IHV),
B) = 0.85; p = 1.8 · 10^(-?). The second ICs are, in turn, very highly correlated with the
solar wind speed v: cc(IC2(Ah), v) = 0.82; p = 4.7 · 10^(-?) and cc(IC2(IHV), v) = 0.89;
p = 5.0 · 10^(-?), or alternatively with v^2: cc(IC2(Ah), v^2) = 0.81; p = 6.2 · 10^(-?) and
cc(IC2(IHV), v^2) = 0.89; p = 2.8 · 10^(-?). These correlations and the above ICA results
expressed in Eq. (9) suggest that the annual averages of all local geomagnetic indices
can be represented as a linear combination of the annual solar wind speed and the IMF
strength, each index with its own optimum relative weights for these two drivers.

Before presenting the results we note that, of course, it is not physically reasonable that 
momentary geomagnetic activity should depend on a linear combination of B and v (or v^2).
Rather, the relation between geomagnetic activity and solar wind parameters is usually
expressed in terms of different nonlinear coupling functions, e.g., Bv^2. There are also many
coupling functions involving, e.g., solar wind density and IMF vector orientation, but at
the annual timescale they do not correlate any better with global geomagnetic activity
than the simple function Bv^2 [Finch and Lockwood, 2007]. The above ICA results and the
earlier results regarding the nonlinear solar wind coupling functions can be understood
as follows. During CMEs the coupling function Bv^2 is mainly enhanced above the mean
value due to large values of B, with v remaining at the average level, while during HSSs
the high values of Bv^2 are due to persistently high values of v, with B attaining average
values [Richardson et al., 2002; Richardson and Cane, 2012].

To further test this hypothesis, we decompose hourly B and v^2 values into constant and
fluctuating parts: B = B_0 + B' and v^2 = v_0^2 + (v^2)', where B_0 and v_0^2 denote the averages
of B and v^2 in 1966-2011. Now we can write

$$B v^2 = B_0 v_0^2 + B' v_0^2 + B_0 (v^2)' + B' (v^2)'. \qquad (10)$$

The first term on the right hand side determines the average value of the coupling function
over the 46 year period (B_0 v_0^2 = 1.3 · 10^6 nT km^2/s^2, B_0 = 6.4 nT, v_0 = 439 km/s), which,
however, does not affect, e.g., the correlation between the coupling function and geomagnetic
activity. Figure 10a shows the annual averages of the three last time-dependent
terms on the right hand side of Eq. 10 including all solar wind data. One can see that
the third term is overall rather small, suggesting that the fluctuations B' and (v^2)'
(and in fact also B and v^2) are rather uncorrelated. This also leads to the fact that, at
the annual time scale, the functional form of the coupling function Bv^2 can indeed be effectively
represented as a linear combination of B and v^2. Hence, at the annual timescale,
geomagnetic activity has two components, one correlated with the IMF strength and the
other with the solar wind speed. Both fluctuating terms in Fig. 10a have approximately
the same range of variation, meaning that B and v^2 contribute to the variations of the
coupling function Bv^2 roughly equally.

Figures 10b and 10c show the annual averages of the three time-dependent terms of Eq.
10 during CMEs and HSSs, respectively. One can see that the term B' v_0^2 clearly dominates
over the two other terms during CMEs while the term B_0 (v^2)' dominates during HSSs.
Therefore, the IMF strength is indeed the dominant parameter driving global geomagnetic
activity during CMEs, while the solar wind speed dominates during HSSs. Interestingly,
in 1994 and 2003 all three terms are high during CMEs, indicating that in these years
CMEs carried strong magnetic fields and were very fast.
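
The algebra behind Eq. 10 can be verified with a short sketch. The hourly values below are hypothetical, the function name `decompose_coupling` is an illustrative assumption, and the averages B_0 and v_0^2 are taken over the handful of samples given rather than over 1966-2011.

```python
def decompose_coupling(B, v):
    """Split each B*v^2 sample into the four terms on the right hand side of Eq. 10."""
    n = len(B)
    v2 = [x * x for x in v]
    B0 = sum(B) / n                      # long-term average of B
    v2_0 = sum(v2) / n                   # long-term average of v^2
    terms = []
    for b, w in zip(B, v2):
        dB, dv2 = b - B0, w - v2_0       # fluctuating parts B' and (v^2)'
        terms.append((B0 * v2_0, dB * v2_0, B0 * dv2, dB * dv2))
    return B0, v2_0, terms

# Hypothetical hourly values: B in nT, v in km/s
B = [5.0, 9.0, 6.0, 4.0]
v = [400.0, 420.0, 600.0, 380.0]
B0, v2_0, terms = decompose_coupling(B, v)
for b, x, t in zip(B, v, terms):
    print(b * x * x, sum(t))             # the four terms add up to B*v^2
```

Averaged over the whole period, the second and third terms vanish by construction, so only the annual (or flow-type specific) averages of these terms carry information.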

5.1. Relation to CMEs and HSSs 

The ICA spatial modes in Figure 9 have quite similar latitudinal patterns to the
average distributions of the Ah_s and IHV_s indices during CMEs and HSSs depicted in
Figure 5. This suggests that IC1(Ah) and IC1(IHV) represent the CME contributions
to these indices while the second ICs represent the HSS contributions. The first ICs
correlate well with the CME fraction (cc(IC1(Ah)) = 0.76, p = 6.3 · 10^(-?); cc(IC1(IHV))
= 0.81, p = 5.6 · 10^(-?)) and the second ICs with the HSS fraction (cc(IC2(Ah)) = 0.73,
p = 7.8 · 10^(-?); cc(IC2(IHV)) = 0.74, p = 4.5 · 10^(-?)). However, it is not physical that the
annual fractions of CMEs and HSSs in the solar wind should determine the yearly levels of
geomagnetic activity because the properties of CMEs and HSSs evolve from one year to
another. For example, as shown clearly in Fig. 10, the speeds and magnetic field strengths
of CMEs and HSSs are different in different years. To take the varying properties of CMEs
and HSSs into account, we estimate the CME and HSS contributions to global geomagnetic
activity by calculating the quantities

$$C = \langle Ap \rangle_{CME} \cdot f_{CME} \qquad (11)$$

$$H = \langle Ap \rangle_{HSS} \cdot f_{HSS}, \qquad (12)$$

where ⟨Ap⟩_CME (⟨Ap⟩_HSS) is the annual average of the Ap index values observed during
CMEs (HSSs) and f_CME (f_HSS) is the annual fraction of CMEs (HSSs) in the solar wind,
and plotting them in Figure 11. As expected, the first ICs (see Fig. 7) are very highly
correlated with the CME contribution (cc(IC1(Ah)) = 0.92, p = 5.7 · 10^(-?); cc(IC1(IHV))
= 0.93, p = 3.3 · 10^(-?)) and the second ICs (see Fig. 8) with the HSS contribution
(cc(IC2(Ah)) = 0.88, p = 4.7 · 10^(-?); cc(IC2(IHV)) = 0.90, p = 4.9 · 10^(-?)). This gives
strong evidence that the first and second ICs indeed represent the contributions of CMEs
and HSSs, respectively, to geomagnetic activity. There are some small differences, e.g.,
between the second ICs and the HSS contribution, especially in 1989, when the HSS contribution
shows a deep minimum but the second ICs only a shallow minimum. These
differences are most likely related to the numerous gaps in the solar wind satellite measurements
in the 1980s and early 1990s, causing larger inaccuracy in the solar wind classification
and in the annual CME and HSS fractions at those times [Richardson and Cane, 2012].
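
Equations 11 and 12 amount to a conditional mean multiplied by an occurrence fraction. A minimal sketch for one year follows; the function name, the string labels for the flow types, and the sample values are hypothetical, and the fraction is assumed to be computed relative to all classified samples of the year (data gaps are not handled).

```python
def contribution(ap_values, flow_types, flow):
    """Eqs. 11-12: mean Ap during `flow` times the fraction of samples classified as `flow`."""
    selected = [ap for ap, ft in zip(ap_values, flow_types) if ft == flow]
    if not selected:
        return 0.0
    mean_ap = sum(selected) / len(selected)
    fraction = len(selected) / len(ap_values)
    return mean_ap * fraction

# Hypothetical Ap samples of one year with their solar wind flow type
ap = [30.0, 50.0, 12.0, 18.0, 6.0, 4.0, 20.0, 8.0]
types = ["CME", "CME", "HSS", "HSS", "slow", "slow", "HSS", "slow"]
C = contribution(ap, types, "CME")
H = contribution(ap, types, "HSS")
print(C, H)
```

Written this way, C and H are simply the parts of the annual Ap sum accumulated during CMEs and HSSs divided by the number of samples, which is why they respond both to how often a flow type occurs and to how geoeffective it is in that year.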

Since the two ICs represent the CME and HSS contributions to geomagnetic activity, the 
corresponding IC spatial modes quantify the weights by which CMEs and HSSs contribute 
to the local geomagnetic activity at the different stations. Although the spatial modes of the
Ah_s and IHV_s indices (see Fig. 9) have a fairly similar latitudinal variation, SM2(IHV)
is at a considerably higher level than SM2(Ah), indicating that the relative contribution of
HSSs is, on average, greater to the IHV_s indices than to the Ah_s indices. Furthermore,
the first spatial mode of IHV_s shows a very deep minimum at the poleward edge of the
auroral oval, meaning that CMEs have a very small contribution to the IHV_s indices at
these latitudes where geomagnetic activity is dominated by HSS-driven substorm activity
in the night sector [Tanskanen et al., 2005, 2011]. This is also consistent with the results
of Finch et al. [2008], who showed that the correlation between geomagnetic activity and solar
wind speed maximizes in the night sector at auroral latitudes. On the other hand, the
Ah_s indices measure all local times and are thus not solely dominated by substorms even
at auroral latitudes, which decreases the relative importance of HSSs in the Ah_s indices.
Because of the strong dominance of HSSs, the IHV_s indices at auroral latitudes have a
slightly higher EOF2 and a smaller EOF1 (see Fig. 2), as discussed in Section 3.2.

In order to exclude the possibility that the spatial modes obtained by the independent
component analysis are artifacts of the method, we have fitted coefficients α and β for the
Ah_s and IHV_s indices of different stations so that

$$Ah_s = \alpha_{Ah} B_s + \beta_{Ah} v^2_s \qquad (13)$$

$$IHV_s = \alpha_{IHV} B_s + \beta_{IHV} v^2_s, \qquad (14)$$

where B_s and v^2_s are the standardized IMF strength and squared solar wind speed, respectively.
The coefficients α_Ah (α_IHV) and β_Ah (β_IHV) are solved using the standard least squares
fitting method. As seen in Fig. 12, the coefficients of Eq. 13 have the same latitudinal
variation as the ICA spatial modes (Eq. 8). Thus, the coefficients α_Ah (α_IHV) and β_Ah
(β_IHV) obtained from the least squares fits are very similar to the first and second spatial
mode coefficients of the Ah_s (IHV_s) indices, respectively. The only systematic difference is
that the β_Ah coefficients are somewhat smaller than the coefficients of SM2(Ah). The
fact that the least squares fit calculated using the measured solar wind data produces
very similar results to the ICA (blind to solar wind data) gives great confidence in the
results based on the ICA method and their interpretation.
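
The two-parameter fit of Eqs. 13 and 14 for a single station can be sketched with NumPy's least squares solver. The data are synthetic, the function names are hypothetical, and no intercept is included on the assumption that all three series are standardized to zero mean.

```python
import numpy as np

def standardize(x):
    """Zero mean and unit (population) standard deviation."""
    return (x - x.mean()) / x.std()

def fit_drivers(index_s, B_s, v2_s):
    """Least squares fit of index_s = alpha * B_s + beta * v2_s for one station."""
    G = np.column_stack([B_s, v2_s])
    coef, *_ = np.linalg.lstsq(G, index_s, rcond=None)
    return coef[0], coef[1]

# Synthetic annual series standing in for 46 years of B, v^2 and one station index
rng = np.random.default_rng(2)
B_s = standardize(rng.normal(size=46))
v2_s = standardize(rng.normal(size=46))
index_s = 0.7 * B_s + 0.5 * v2_s          # noise-free, so the fit should return 0.7 and 0.5

alpha, beta = fit_drivers(index_s, B_s, v2_s)
print(alpha, beta)
```

Repeating the fit station by station gives one α and one β per latitude, which is the quantity compared with the ICA spatial modes in the text.
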
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6. Conclusions 

In this paper we have studied the spatio-temporal evolution of geomagnetic activity 
in 1966-2011 using local Ah and IHV indices of 26 stations covering a wide range of 
latitudes. We analyzed the indices using the principal component analysis method and 
confirmed that our recent results for the Ah indices [Holappa et al., 2014] also hold for 
the IHV indices, i.e., that the first PC describes the global average geomagnetic activity and the 
second PC describes the deviations from the global average caused by high speed streams. 
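
The decomposition into EOFs (spatial patterns) and PCs (time series) summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the six synthetic "stations", their made-up latitude weights and all function names are hypothetical, and the eigenvectors are found by simple power iteration with deflation rather than by a library routine.

```python
import math
import random


def standardize(series):
    """Shift a series to zero mean and scale it to unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]


def correlation(a, b):
    """Pearson correlation coefficient of two equally long series."""
    sa, sb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(sa, sb)) / len(a)


def leading_eigenpair(matrix, iterations=300):
    """Dominant eigenpair of a symmetric positive semidefinite matrix."""
    start = random.Random(1)
    vec = [start.uniform(0.5, 1.5) for _ in matrix]
    for _ in range(iterations):
        new = [sum(m * x for m, x in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    product = [sum(m * x for m, x in zip(row, vec)) for row in matrix]
    return sum(p * x for p, x in zip(product, vec)), vec


def first_two_modes(stations):
    """Return [(eigenvalue, EOF, PC), ...] for the two leading modes.

    `stations` is a list of standardized annual series, one per station.
    """
    n_years = len(stations[0])
    cov = [[sum(a * b for a, b in zip(si, sj)) / n_years for sj in stations]
           for si in stations]
    modes = []
    for _ in range(2):
        eigenvalue, eof = leading_eigenpair(cov)
        pc = [sum(e * s[t] for e, s in zip(eof, stations))
              for t in range(n_years)]
        modes.append((eigenvalue, eof, pc))
        # Deflate: remove the mode just found before searching for the next.
        cov = [[c - eigenvalue * eof[i] * eof[j] for j, c in enumerate(row)]
               for i, row in enumerate(cov)]
    return modes


# Synthetic data: a common signal plus a second signal whose weight
# changes sign across the stations (made-up latitude dependence).
rng = random.Random(0)
n_years = 46
common = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
stream = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
weights = [-0.6, -0.36, -0.12, 0.12, 0.36, 0.6]
stations = [
    standardize([g + w * h + rng.gauss(0.0, 0.1)
                 for g, h in zip(common, stream)])
    for w in weights
]

(lam1, eof1, pc1), (lam2, eof2, pc2) = first_two_modes(stations)
global_mean = [sum(s[t] for s in stations) / len(stations)
               for t in range(n_years)]
explained = (lam1 + lam2) / len(stations)
print(f"variance explained by two modes: {explained:.3f}")
print(f"|corr(PC1, global mean)| = {abs(correlation(pc1, global_mean)):.3f}")
```

In this synthetic setting the first EOF has the same sign at every station, so the first PC follows the station average, while the second EOF changes sign along the made-up latitude ordering.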

We used the independent component analysis method to rotate the first two PCs into 
two independent components (ICs). The spatial modes of the two ICs clearly correspond 
to the distribution of the indices during CMEs (first mode) and HSSs (second mode). 
The two ICs were found to match very well with the CME and HSS contributions to 
global geomagnetic activity. We also found that the first IC and the second IC correlate 
very highly with the IMF strength and the solar wind speed, respectively. This is because 
high values of the IMF strength dominate the (larger than average) 
driving of geomagnetic activity during CMEs, while high solar wind speed dominates the 
driving during HSSs. 
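
For two standardized, uncorrelated PCs, the ICA step reduces to finding a single rotation angle. The sketch below illustrates this with a simple stand-in technique, a grid search for the angle that maximizes the summed squared excess kurtosis of the rotated components. It is not claimed to be the ICA algorithm used in this study; the function names and the synthetic uniform sources are hypothetical, and the sample is far longer than the 46 annual values analyzed here, for which any such estimate would be considerably noisier.

```python
import math
import random


def standardize(series):
    """Shift a series to zero mean and scale it to unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]


def correlation(a, b):
    """Pearson correlation coefficient of two equally long series."""
    sa, sb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(sa, sb)) / len(a)


def excess_kurtosis(series):
    """Sample excess kurtosis (zero for a Gaussian distribution)."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    return m4 / (m2 * m2) - 3.0


def rotate(first, second, theta):
    """Project two series onto the rows of a 2x2 rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    out1 = [c * a + s * b for a, b in zip(first, second)]
    out2 = [-s * a + c * b for a, b in zip(first, second)]
    return out1, out2


def find_rotation(pc1, pc2, steps=900):
    """Grid search over [0, pi/2) for the most non-Gaussian rotation."""
    best_theta, best_score = 0.0, -1.0
    for k in range(steps):
        theta = 0.5 * math.pi * k / steps
        c1, c2 = rotate(pc1, pc2, theta)
        score = excess_kurtosis(c1) ** 2 + excess_kurtosis(c2) ** 2
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta, rotate(pc1, pc2, best_theta)


# Two independent non-Gaussian sources (synthetic, uniform distribution).
rng = random.Random(0)
n = 4000
s1 = standardize([rng.uniform(-1.0, 1.0) for _ in range(n)])
s2 = standardize([rng.uniform(-1.0, 1.0) for _ in range(n)])

# Mix them with a known rotation; these play the role of the two PCs.
theta_true = math.radians(30.0)
pc1, pc2 = rotate(s1, s2, -theta_true)

theta_est, (ic1, ic2) = find_rotation(pc1, pc2)
print(f"estimated angle: {math.degrees(theta_est):.1f} degrees")
print(f"|corr(IC1, source 1)| = {abs(correlation(ic1, s1)):.3f}")
```

The recovered components are defined only up to sign and ordering, which is why the search can be restricted to a quarter turn.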

We found essentially similar results for both the Ah indices, which include all local times, and the 
IHV indices, which include only the night sector. This shows that the residual Sq variation 
in the Ah indices has no major effect on the main results. It is also very reassuring 
that the same results can be found using indices which define geomagnetic activity quite 
differently: the Ah index is a traditional range index, while the IHV index uses hourly absolute 
differences. Despite these differences between the two indices, the PC and IC 
methods are able to find essentially the same information about the solar wind drivers. 

The combined PC/IC method presented here offers a new way to gain information about 
the relative occurrence of CMEs and HSSs and the long-term properties of the solar wind, in 
particular the IMF strength and the solar wind speed. This improves our understanding 
of the long-term evolution of the solar wind and the long-term driving of geomagnetic activity 
by the different solar wind structures. 

 #  Station name and code       GG lat   GG long   CGM lat  CGM long
 1  Alibag (ABG)                18.638    72.872      9.52    145.27
 2  M'Bour (MBO)                14.384   -16.967     20.78    56.717
 3  Kanoya (KNY)                31.420   130.882     24.17   202.020
 4  Kakioka (KAK)               36.233   140.183     28.78    210.93
 5  San Juan (SJG)              18.382   -66.118     29.27      5.02
 6  Memambetsu (MMB)            43.907   144.193     36.56    214.56
 7  Chambon-la-Forêt (CLF)      48.017     2.267     43.67     79.94
 8  Irkutsk (IRT)               52.167   104.450     46.78    176.67
 9  Belsk (BEL)                 51.837    20.792     47.41     96.38
10  Niemegk (NGK)               52.072    12.675     47.93     89.65
11  Hartland (HAD)              51.000    -4.483     47.99     75.55
12  Wingst (WNG)                53.743     9.073     50.05     87.31
13  Fredericksburg (FRD)        38.210   -77.367     50.07    356.16
14  Eskdalemuir (ESK)           55.317    -3.200     52.95     78.22
15  Victoria (VIC)              48.517  -123.417     54.04    294.56
16  Nurmijärvi (NUR)            60.508    24.655     56.69    102.78
17  Lerwick (LER)               60.133    -1.183     58.16     82.11
18  Sitka (SIT)                 57.052  -135.335     59.82    278.10
19  Meanook (MEA)               54.615  -113.347     62.41    303.72
20  Sodankylä (SOD)             67.367    26.633     63.64    108.17
21  College (CMO)               64.867  -147.860     64.88    261.68
22  Abisko (ABK)                68.358    18.823     65.11    102.91
23  Leirvogur (LRV)             64.183     -21.7     65.46     68.57
24  Fort Churchill (FCC)        58.786   -94.088     69.61    330.03
25  Baker Lake (BLC)            64.333   -96.033     74.59    324.68
26  Thule (THL)                 77.483   -69.167     86.00     36.77

Table 1. Stations and their geographic (GG) and corrected geomagnetic (CGM) latitudes 
and longitudes (in degrees). Stations are ordered according to their CGM latitudes. 

Figure 1. The first principal component of a) the Ah_s indices and b) the IHV_s indices. c) The annual 
averages of the Ap index. 

Figure 2. The first two EOFs of a) the Ah_s and b) the IHV_s indices as a function of corrected 
geomagnetic latitude. 

Figure 3. a-b) The second PC of the Ah_s and the IHV_s indices. c-d) Yearly fraction of HSSs 
and CMEs. 

Figure 4. a) The difference between the Ah_s index of the FCC station and the global average of 
the Ah_s indices (solid line), and the second PC scaled by the EOF2 of the FCC station (dashed line). 
b) The same for the IHV_s indices. 

Figure 5. Averages of the a) Ah_s and b) IHV_s indices during CMEs and HSSs. 

Figure 6. Scatter plot of the standardized first and second PCs of the Ah_s indices (denoted 
in black). The red and blue arrows represent the row vectors of the rotation matrix A onto which 
the PCs are projected. 

Figure 7. a-b) The first ICs of the Ah_s and the IHV_s indices. c) The annual averages of the 
IMF strength B. 

Figure 8. a-b) The second ICs of the Ah_s and the IHV_s indices. c) The annual averages of 
the solar wind speed. 

Figure 9. The spatial modes corresponding to the two ICs of a) the Ah_s and b) the IHV_s indices as 
functions of corrected geomagnetic latitude. 

Figure 10. Annual averages of the three time dependent terms in Equation 10 contributing 
to the coupling function $Bv^2$ during a) all times, b) CME intervals and c) HSS intervals. 


DRAFT, January 16, 2015, 1:25am 

HOLAPPA ET AL.: INDEPENDENT COMPONENT ANALYSIS 

Figure 11. Annual a) CME and b) HSS contributions to the Ap index. 

Figure 12. Least squares fit coefficients $\alpha$ and $\beta$ (Eqs. 13-14) for a) the Ah_s indices and b) 
the IHV_s indices. 